Pushing changes in progress for git-lfs support adding to entire ecosystem. #9766

GarryHurleyJr · 2024-05-18T02:52:26Z

Draft pull request. This is to keep my work in progress. The branch name will change before this PR is finalized. I will be adding code to this as time goes on.

…-core-for-lfs-as-an-option-for-repositories

bdragon

I added some inline suggestions and here's some general Ruby feedback. I've added links to the docs for our linter, RuboCop, which provide some examples and more detail. Not trying to nitpick, just trying to save you some headache with the linter and code reviews 🙂

Naming conventions: method names, method parameter names, block parameter names, and variable names use snake_case.
Redundant return. In Ruby the last expression in a block or method is the return value of that block or method. In these cases the explicit return should be omitted. It is common/idiomatic to use guard clauses in methods to reduce if/else nesting, so the return is obviously needed for those cases.
unless !!value is unnecessary. Just let the value evaluate as truthy/falsy. Hint: in Ruby false and nil are falsy and literally all other values are truthy. If you need to explicitly check for nil, use value.nil?.
You probably haven't gotten to this point in the implementation yet, but if you end up adding methods to a file that has the # typed: strict annotation you're going to want to add Sorbet sig blocks to the methods so the type checker passes. The syntax is weird so I would just look for other examples in the file/codebase and follow what they're doing.

bdragon · 2024-05-21T18:56:44Z

common/lib/dependabot/file_fetchers/base.rb

+        lfsEnabled = true
+        commandString = getCommandString(path,lfsEnabled)
+        /debugger
+        p commandString/
+        SharedHelpers.run_shell_command(commandString
+         ).split("\n").filter_map do |line|


I think this could be simplified a bit without the helper methods and by using Dir.chdir

Suggested change

lfsEnabled = true

commandString = getCommandString(path,lfsEnabled)

/debugger

p commandString/

SharedHelpers.run_shell_command(commandString

).split("\n").filter_map do |line|

lfs_enabled = # ...

git_bin = lfs_enabled ? "git-lfs" : "git"

Dir.chdir(path) do

SharedHelpers.run_shell_command("#{git_bin} ls-files --stage")

end.split("\n").filter_map do |line|

bdragon · 2024-05-21T18:58:10Z

common/lib/dependabot/file_fetchers/base.rb

+	      #the HEREDOC command will see any stray spaces.
+        return "git -C #{path} ls-files --stage" unless lfsEnabled
+        Dependabot.logger.warn("LFS is enabled in this repo.  Please use an LFS enabled client")
+        commandString = "CWD=\"#{__dir__}\";cd #{path};git-lfs ls-files --stage;cd $CWD"


I suggested something a little simpler above, but just adding a couple notes:

__dir__ is the name of the directory that contains this source code file, not the current working directory of the process

Rather than setting an environment variable for the directory, this is a good use case for Dir.chdir. This method changes into the given directory, runs whatever is inside block, and then changes back to the original directory.

GarryHurleyJr · 2024-05-21T22:26:33Z

With some discussion, I have a path towards getting this back on track.
detecting lfs

clone shallow
check for .gitattributes file
if not found, no lfs
if found and does not contain "filter=lfs", no lfs
else lfs

if lfs detected, use git-lfs
if no lfs use git

…fs-as-an-option-for-repositories' of https://github.com/dependabot/dependabot-core into GarryHurleyJr/6039-add-support-in-dependabot-core-for-lfs-as-an-option-for-repositories

abdulapopoola · 2024-05-29T21:23:42Z

common/lib/dependabot/file_fetchers/base.rb

@@ -890,10 +890,10 @@ def find_submodules(path)
      sig { params(path: String).returns(T::Boolean) }
      def isLfsEnabled(path)
        filepath = File.join(path,".gitattributes")
-        lfsEnabled = File.exist?(filepath) && File.readable?(filepath) && SharedHelpers.run_shell_command("cat #{filepath} | grep \"filter=lfs\"").include? "#{filepath}"
+        lfsEnabled = T.let(true, T::Boolean) if File.exist?(filepath) && File.readable?(filepath) && SharedHelpers.run_shell_command("cat #{filepath} | grep \"filter=lfs\"").include? "filter=lfs"


any reason for copying the same code pattern in 3 different files? Can we centralize it in one location?

abdulapopoola · 2024-05-29T22:34:55Z

common/spec/dependabot/file_fetchers/base_spec.rb

@@ -304,7 +304,7 @@ def fetch_files
    before do
      allow(file_fetcher_instance).to receive(:commit).and_return("sha")
    end
-    #start of lfs testing
+


Can we have unit tests covering these new scenarios?

jurre · 2024-05-30T09:32:15Z

common/lib/dependabot/file_fetchers/base.rb

+      rescue Exception => e
+        # this should not be needed, but I don't trust 'should'
+        lfs_enabled = T.let(false, T::Boolean)
+        raise e.exception("Message: #{e.message}")


We're rescuing the exception, setting lfs_enabled and then immediately re-raising the exception. What's the point of doing all of that? The exception will halt the code from executing, so all of this can just be removed and the end result is the same: an exception is raised.

jurre · 2024-05-30T09:45:30Z

common/lib/dependabot/file_fetchers/base.rb

-            git -C #{path} ls-files --stage
-          CMD
-        ).split("\n").filter_map do |line|
+        lfs_enabled = is_lfs_enabled(path) if lfs_enabled.nil?


Suggested change

lfs_enabled = is_lfs_enabled(path) if lfs_enabled.nil?

lfs_enabled = is_lfs_enabled(path) if lfs_enabled.nil?

Looks like we're calling .nil? on lfs_enabled before it is defined, what exactly are we trying to do here?

jurre · 2024-05-30T09:45:48Z

common/lib/dependabot/file_fetchers/base.rb

+      sig { params(path: String).returns(T.nilable(T::Boolean)) }
+      def is_lfs_enabled(path)
+        filepath = File.join(path, ".gitattributes")
+        lfs_enabled = T.let(true, T::Boolean) if File.exist?(filepath) && File.readable?(filepath) &&


Suggested change

lfs_enabled = T.let(true, T::Boolean) if File.exist?(filepath) && File.readable?(filepath) &&

T.let(true, T::Boolean) if File.exist?(filepath) && File.readable?(filepath) &&

No need to assign anything to this variable, we're just returning the results of the method calls

jurre · 2024-05-30T09:59:38Z

common/lib/dependabot/file_fetchers/base.rb

+      sig { params(path: String).returns(T.nilable(T::Boolean)) }
+      def is_lfs_enabled(path)
+        filepath = File.join(path, ".gitattributes")
+        lfs_enabled = T.let(true, T::Boolean) if File.exist?(filepath) && File.readable?(filepath) &&
+          SharedHelpers.run_shell_command("cat #{filepath} | grep \"filter=lfs\"").include?("filter=lfs")
+      rescue Exception => e
+        # this should not be needed, but I don't trust 'should'
+        lfs_enabled = T.let(false, T::Boolean)
+        raise e.exception("Message: #{e.message}")
+      end


I think all of this can be simplified to this:

Suggested change

sig { params(path: String).returns(T.nilable(T::Boolean)) }

def is_lfs_enabled(path)

filepath = File.join(path, ".gitattributes")

lfs_enabled = T.let(true, T::Boolean) if File.exist?(filepath) && File.readable?(filepath) &&

SharedHelpers.run_shell_command("cat #{filepath} | grep \"filter=lfs\"").include?("filter=lfs")

rescue Exception => e

# this should not be needed, but I don't trust 'should'

lfs_enabled = T.let(false, T::Boolean)

raise e.exception("Message: #{e.message}")

end

sig { params(path: String).returns(T::Boolean) }

def lfs_enabled?(path)

gitattributes_path = File.join(path, '.gitattributes')

return false unless File.exist?(gitattributes_path)

File.readlines(gitattributes_path).each do |line|

return true if line =~ /filter=lfs/

end

false

end

jurre · 2024-05-30T10:02:43Z

common/lib/dependabot/shared_helpers.rb

-      Dir.chdir(path) do
-        run_shell_command("git reset HEAD --hard")
-        run_shell_command("git clean -fx")
+      if lfs_enabled?(path) 


Why do we need 4 different methods in 3 different places to check if lfs is enabled?

Because we have three different files in common that each call the git command in a different way and none of them refer to the others. To make matters worse, some code never even bothers to use any of them.

Seems like the could all still share a single, shared way to check if lfs is enabled?

bdragon · 2024-05-30T17:06:28Z

common/lib/dependabot/shared_helpers.rb

+      T.let(true, T::Boolean) if File.exist?(filepath) && File.readable?(filepath) &&
+                                        SharedHelpers.run_shell_command("cat #{filepath} | grep \"filter=lfs\"")
+                                                     .include?("filter=lfs")
+    rescue StandardError => e


Jurre's suggestion is more idiomatic and won't incur the cost of shelling out, so I think you should incorporate that.

Just to call attention to it, shelling out to grep like this will return a non-zero exit if there's no match. SharedHelpers.run_shell_command will raise a HelperSubprocessFailed exception on a non-zero exit, but here we're swallowing virtually any error that could occur and carrying on. The problem with this is it assumes the only thing that could go wrong is grep not finding a match, when there are other legitimate exceptions that could occur that we would want to propagate.

If you find yourself writing rescue StandardError, it might be a good idea to ask if it's really necessary for what you're trying to do. Doing so will match any subclass of StandardError, which is the vast majority of errors you'll encounter (the Ruby docs on Exception are good resource). If you truly want to check for the HelperSubprocessFailed exception, rescue that class specifically so that other, unexpected exceptions are propagated correctly.

jeffwidman · 2024-07-03T19:52:05Z

Garry has moved on from the team. The underlying issue still exists, but I suspect whoever tackles this will likely start from a fresh PR, so I'm closing this.

Note that there's also this related PR:

Support Go modules with LFS committed files #10052

Pushing changes in progress

85ad379

GarryHurleyJr requested a review from a team as a code owner May 18, 2024 02:52

Merge branch 'main' into GarryHurleyJr/6039-add-support-in-dependabot…

528b2ad

…-core-for-lfs-as-an-option-for-repositories

GarryHurleyJr changed the title ~~Pushing changes in progress~~ Pushing changes in progress for git-lfs support adding to entire ecosystem. May 21, 2024

bdragon reviewed May 21, 2024

View reviewed changes

GarryHurleyJr added 9 commits May 23, 2024 11:30

Corrected the approach for determining the command string.

b7858c5

cleanup of common files

416f73c

cleanup of common files

1be323c

Git class updates

fed7b21

Pushing changes in progress

859e451

Corrected the approach for determining the command string.

b58fb65

cleanup of common files

00b1f37

cleanup of common files

aa9e64e

Git class updates

e610181

GarryHurleyJr force-pushed the GarryHurleyJr/6039-add-support-in-dependabot-core-for-lfs-as-an-option-for-repositories branch from fed7b21 to e610181 Compare May 29, 2024 15:58

GarryHurleyJr added 4 commits May 29, 2024 11:59

Merge branch 'GarryHurleyJr/6039-add-support-in-dependabot-core-for-l…

bd0ba9c

…fs-as-an-option-for-repositories' of https://github.com/dependabot/dependabot-core into GarryHurleyJr/6039-add-support-in-dependabot-core-for-lfs-as-an-option-for-repositories

sorbet fixes

512226c

sorbet fixes

3880844

sorbet fixes

7278432

abdulapopoola reviewed May 29, 2024

View reviewed changes

sorbet fixes

6ac6646

abdulapopoola reviewed May 29, 2024

View reviewed changes

GarryHurleyJr added 3 commits May 29, 2024 18:49

lint fixes

32cf333

lint fixes

1fd1283

lint fixes

037657b

jurre reviewed May 30, 2024

View reviewed changes

GarryHurleyJr added 3 commits May 30, 2024 08:48

lint fixes

f079add

lint fixes

5f555ef

lint fixes

b5ec97b

bdragon reviewed May 30, 2024

View reviewed changes

jeffwidman closed this Jul 3, 2024

jeffwidman deleted the GarryHurleyJr/6039-add-support-in-dependabot-core-for-lfs-as-an-option-for-repositories branch July 3, 2024 19:52

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Pushing changes in progress for git-lfs support adding to entire ecosystem. #9766

Pushing changes in progress for git-lfs support adding to entire ecosystem. #9766

GarryHurleyJr commented May 18, 2024 •

edited by jeffwidman

Loading

bdragon left a comment •

edited

Loading

bdragon May 21, 2024

bdragon May 21, 2024

GarryHurleyJr commented May 21, 2024

abdulapopoola May 29, 2024

abdulapopoola May 29, 2024

jurre May 30, 2024

jurre May 30, 2024

jurre May 30, 2024

jurre May 30, 2024 •

edited

Loading

jurre May 30, 2024 •

edited

Loading

GarryHurleyJr May 30, 2024

jurre May 30, 2024

bdragon May 30, 2024 •

edited

Loading

jeffwidman commented Jul 3, 2024

	lfs_enabled = is_lfs_enabled(path) if lfs_enabled.nil?
	lfs_enabled = is_lfs_enabled(path) if lfs_enabled.nil?

	lfs_enabled = T.let(true, T::Boolean) if File.exist?(filepath) && File.readable?(filepath) &&
	T.let(true, T::Boolean) if File.exist?(filepath) && File.readable?(filepath) &&

Pushing changes in progress for git-lfs support adding to entire ecosystem. #9766

Pushing changes in progress for git-lfs support adding to entire ecosystem. #9766

Conversation

GarryHurleyJr commented May 18, 2024 • edited by jeffwidman Loading

bdragon left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

GarryHurleyJr commented May 21, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jurre May 30, 2024 • edited Loading

Choose a reason for hiding this comment

jurre May 30, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

bdragon May 30, 2024 • edited Loading

Choose a reason for hiding this comment

jeffwidman commented Jul 3, 2024

GarryHurleyJr commented May 18, 2024 •

edited by jeffwidman

Loading

bdragon left a comment •

edited

Loading

jurre May 30, 2024 •

edited

Loading

jurre May 30, 2024 •

edited

Loading

bdragon May 30, 2024 •

edited

Loading